refactor: state schema with per-resource content hashes by dhruva-vapi · Pull Request #19 · VapiAI/gitops

dhruva-vapi · 2026-05-01T20:01:04Z

ELI5

Problem. The state file (.vapi-state.<env>.json) used to map
name → UUID and nothing else. So when push went to update a resource,
the engine had no way to tell whether someone had edited the resource
on the dashboard since you last pulled — there was nothing to compare
to. This is the root cause of "drift detection isn't possible," "real
rollback isn't possible," and "scoped pushes can't be precise about
what they touched": the engine has no per-resource memory of what
was there before.

What this fix does. Widens each state entry from a bare string
(the UUID) to a ResourceState object carrying:

- uuid: the platform UUID (unchanged semantics)
- lastPulledHash: sha256 of the platform payload at last pull
- lastPulledAt: ISO timestamp
- lastPushedHash: sha256 of the last pushed payload
- platformVersionId: Stack I, populated when the platform exposes one

Every state-reading and state-writing call site is updated. No new
external behavior ships in this PR alone — strictly plumbing.
Backwards compatible: legacy state files (the old string shape) load
fine, just without hashes until the next pull/push populates them. The
on-disk file isn't rewritten until the next saveState, so a "deploy
and immediately rollback" doesn't corrupt state.

Outcome you'll notice. This PR alone changes nothing visible. It's
the architectural foundation that drift detection (Stack G), snapshot
rollback (Stack H), and scoped state writes (Stack J) all depend on.
After it lands, your next pull populates lastPulledHash for every
resource, and the next three PRs unlock real safety guarantees.

Architectural pivot. State sections move from Record<string, string>
(name → UUID) to Record<string, ResourceState> carrying:

- uuid: string (the platform UUID, unchanged semantics)
- lastPulledHash?: string (sha256 of canonicalized platform payload)
- lastPulledAt?: string (ISO timestamp)
- lastPushedHash?: string (sha256 of last pushed payload)
- platformVersionId?: string (Stack I; populated when the platform exposes one)
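The widened entry can be sketched as a TypeScript type. The field names and optionality come from the list above; the `StateSection` alias, the exact signature of `stateUuid`, and the sample UUID are assumptions made for illustration, not the contents of src/types.ts or src/state.ts.

```typescript
// Sketch of the widened state entry; field names follow the PR description.
interface ResourceState {
  uuid: string;               // platform UUID (unchanged semantics)
  lastPulledHash?: string;    // sha256 of the canonicalized platform payload
  lastPulledAt?: string;      // ISO timestamp of the last pull
  lastPushedHash?: string;    // sha256 of the last pushed payload
  platformVersionId?: string; // populated when the platform exposes one
}

// Hypothetical alias for one state section: resource name -> ResourceState.
type StateSection = Record<string, ResourceState>;

// Assumed shape of a helper like stateUuid: read only the UUID for the
// common case where the caller does not care about hashes.
function stateUuid(section: StateSection, name: string): string | undefined {
  return section[name]?.uuid;
}

// Made-up entry for illustration; no hashes yet, as after a legacy load.
const assistants: StateSection = {
  "support-bot": { uuid: "11111111-2222-3333-4444-555555555555" },
};
```

Call sites that previously read `state.assistants[name]` as a string would read `stateUuid(state.assistants, name)` or `?.uuid` instead, which is the mechanical change the Files list describes.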

This is the architectural prerequisite for drift detection (Stack G),
snapshot rollback (Stack H), optimistic concurrency (Stack I), and scoped
state writes (Stack J). Every state-reading call site is updated, but
NO new external behavior ships in this PR — strictly plumbing.

Backwards compatibility:

- src/state.ts:loadState wraps any legacy bare-string value as
  { uuid: <string> } at load time. Existing customer state files keep
  working until their next pull populates hashes. No flag-day migration.
- The on-disk file is NOT rewritten until the next saveState, so a
  "deploy and immediately rollback" scenario does NOT corrupt state.
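A minimal sketch of the load-time wrapping described above. The names `asResourceState` and `migrateSection` appear in the Files list, but their signatures here are assumptions; the `LegacyOrNew` type is hypothetical. The point is that migration happens in memory only, so nothing on disk changes until a later save.

```typescript
interface ResourceState {
  uuid: string;
  lastPulledHash?: string;
  lastPulledAt?: string;
  lastPushedHash?: string;
  platformVersionId?: string;
}

// Hypothetical union: a parsed state file may hold either shape per entry.
type LegacyOrNew = string | ResourceState;

// Wrap a legacy bare-string UUID; pass new-shape entries through unchanged.
function asResourceState(value: LegacyOrNew): ResourceState {
  return typeof value === "string" ? { uuid: value } : value;
}

// Migrate one section in memory. The input object is not mutated, and a
// missing section becomes an empty one.
function migrateSection(
  section: Record<string, LegacyOrNew> | undefined,
): Record<string, ResourceState> {
  const out: Record<string, ResourceState> = {};
  for (const [name, value] of Object.entries(section ?? {})) {
    out[name] = asResourceState(value);
  }
  return out;
}
```

A mixed section such as `{ a: "u1", b: { uuid: "u2", lastPulledHash: "h" } }` would come back with `a` wrapped as `{ uuid: "u1" }` and `b` untouched.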

Files:

- src/types.ts: ResourceState type, StateFile sections retyped.
- src/state-serialize.ts: hashPayload (canonicalize + sha256),
  asResourceState (legacy migration), upsertState (preserves un-touched
  fields when patching).
- src/state.ts: stateUuid helper for the common case;
  loadState wraps legacy string entries via migrateSection;
  re-exports the helpers for ergonomics.
- src/pull.ts: each pull populates lastPulledHash + lastPulledAt;
  credential entries preserve prior metadata when slug+uuid are stable.
- src/push.ts: each PATCH/POST populates lastPushedHash via upsertState.
  All state.X[id] reads → ?.uuid. State assignments → upsertState.
- src/cleanup.ts, src/credentials.ts, src/delete.ts, src/eval.ts,
  src/resolver.ts, src/call.ts: mechanical updates for the new shape.
  Verified by tsc: no leaks where a bare string is still expected.
- tests/state-migration.test.ts: legacy string entries load and
  round-trip; mixed legacy + new entries; canonicalize stability;
  hashPayload determinism; upsertState preservation semantics.
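The two helpers in src/state-serialize.ts can be sketched as follows, assuming that "canonicalize" means recursively sorting object keys before serializing and that payloads are JSON-serializable objects. The signatures are assumptions; only the names `hashPayload` and `upsertState` and their stated behavior come from the list above.

```typescript
import { createHash } from "node:crypto";

interface ResourceState {
  uuid: string;
  lastPulledHash?: string;
  lastPulledAt?: string;
  lastPushedHash?: string;
  platformVersionId?: string;
}

// Recursively sort object keys so logically equal payloads serialize to the
// same string regardless of key order. Array order is preserved.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const src = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(src).sort()) out[key] = canonicalize(src[key]);
    return out;
  }
  return value;
}

// sha256 (hex) over the canonical JSON form of the payload.
function hashPayload(payload: unknown): string {
  return createHash("sha256")
    .update(JSON.stringify(canonicalize(payload)))
    .digest("hex");
}

// Patch one entry while keeping fields the patch does not mention, so a push
// that sets lastPushedHash does not erase lastPulledHash.
function upsertState(
  section: Record<string, ResourceState>,
  name: string,
  patch: Partial<ResourceState> & { uuid: string },
): void {
  section[name] = { ...section[name], ...patch };
}
```

Sorting keys before hashing is what makes the hash usable for comparison later: two payloads that differ only in key order produce the same digest, so only real content changes register as a difference.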

Closes improvements.md #4 (architectural prerequisite). G/H/I/J unblocked.

🤖 Generated with Claude Code

dhruva-vapi · 2026-05-01T20:01:15Z

**Problem.** The Vapi API rejects bad configs at PATCH time with terse 400s ("property speed should not exist"), and by then the push has already partially completed against other resources. We watched the same five classes of mistake hit production over and over:

1. Assistant names (or eval names) longer than 40 chars (silent cap).
2. Structured-output ↔ assistant lockstep mismatch: one side declares the relationship, the other doesn't, and the dashboard ends up inconsistent.
3. Prompts duplicated by paste-on-top dashboard edits (10kB prompt with two identical headers stacked, agent follows both).
4. `maxTokens` set lower than the JSON-schema size of the attached tools' arguments: the assistant looks fine on push, then bricks on the first tool-using call.
5. Voice fields nested wrong for the provider (`voice.speed` on Cartesia, where it lives at `voice.generationConfig.speed`).

**What this fix does.** Five client-side validators, all running off the same `LoadedResources` shape that `push.ts` would actually ship, so the lint runs against exactly what would be pushed, with no separate parser to drift. Surfaces as warnings by default (one bad spec doesn't block an otherwise-good push); promote to abort with `--strict`. Run standalone via `npm run validate -- <org>`.

**Outcome you'll notice.** Most schema-class mistakes get caught locally in seconds instead of mid-push 400s. Voice provider field mismatch gets a specific message pointing at the right path. CI can add `npm run push -- <env> --strict` as a gate before any deploy.

---

Catch the classes of errors that today only surface when the API returns a 400 mid-push. The push pipeline runs validation in warn-only mode by default; --strict promotes errors to a blocking abort before any API call. Standalone runner via `npm run validate -- <org>`.

Validators implemented:

1. Name length cap (40 chars). Walks every assistant.name and every evaluations[].structuredOutput.name in scenarios. Closes #18.
2. SO ↔ assistant bidirectional lockstep. For every SO file's assistant_ids, checks the named assistant's structuredOutputIds mirrors it; reverse direction too. Closes #11.
3. Prompt duplication heuristics. Same H1 heading appearing twice, repeated CONTINUITY ON ENTRY / CLOSEOUT FLOW STRUCTURE blocks. Partial fix for #8 (paste-on-top dashboard duplications).
4. maxTokens floor for tool-using assistants. Computes floor ≈ 25 + sum(len(JSON.stringify(tool.function.parameters))) per attached tool. Warns under floor. Closes #19.
5. Per-provider voice schema. Cartesia rejects top-level speed / stability / similarityBoost / enableSsmlParsing (point at generationConfig.* / drop the field). 11labs rejects generationConfig (it's a Cartesia path). Closes #9 (engine half).

- src/validate.ts (NEW): validateResources(loadedResources) returning ValidationFinding[] with severity / type / resourceId / rule / message / fieldPath. Pure data; safe to test directly.
- src/validate-cmd.ts (NEW): CLI entry. Loads same resource shape as push.ts so the lint runs against exactly what would ship. Exit non-zero on any error finding.
- src/config.ts: --strict flag.
- src/push.ts: validators run in default-warn mode; --strict aborts.
- package.json: validate script.
- AGENTS.md: document npm run validate and --strict.
- tests/validate.test.ts: per-rule fixtures (golden + bad inputs) covering all five checks.

Closes improvements.md #11, #18, #19. Resolves engine half of #9. Partial #8, #20 (heuristic only).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
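Two of the validators described above (the name length cap and the maxTokens floor) can be sketched as a pure function over a simplified assistant shape. The `ValidationFinding` field names come from the comment; the value types, the `AssistantSketch` and `ToolSketch` shapes, the rule identifiers, the chosen severities, and the location of `maxTokens` on the assistant are all assumptions for illustration, not the code in src/validate.ts.

```typescript
// Field names follow the comment above; the value types are assumed.
interface ValidationFinding {
  severity: "error" | "warning";
  type: string;
  resourceId: string;
  rule: string;
  message: string;
  fieldPath?: string;
}

// Hypothetical, simplified shapes; real assistant configs are richer.
interface ToolSketch { function?: { parameters?: unknown } }
interface AssistantSketch { name: string; maxTokens?: number; tools?: ToolSketch[] }

const NAME_CAP = 40;

// floor ≈ 25 + sum of the serialized length of each tool's parameter schema.
function maxTokensFloor(tools: ToolSketch[]): number {
  return tools.reduce(
    (sum, tool) => sum + JSON.stringify(tool.function?.parameters ?? {}).length,
    25,
  );
}

// Returns findings as plain data; the caller decides whether to warn or abort.
function checkAssistant(resourceId: string, a: AssistantSketch): ValidationFinding[] {
  const findings: ValidationFinding[] = [];
  if (a.name.length > NAME_CAP) {
    findings.push({
      severity: "error", type: "assistant", resourceId, rule: "name-length",
      message: `name is ${a.name.length} chars; cap is ${NAME_CAP}`,
      fieldPath: "name",
    });
  }
  const tools = a.tools ?? [];
  if (tools.length > 0 && a.maxTokens !== undefined) {
    const floor = maxTokensFloor(tools);
    if (a.maxTokens < floor) {
      findings.push({
        severity: "warning", type: "assistant", resourceId, rule: "max-tokens-floor",
        message: `maxTokens ${a.maxTokens} is below the estimated floor ${floor}`,
        fieldPath: "maxTokens",
      });
    }
  }
  return findings;
}
```

Because the function returns data rather than printing or exiting, the same findings can drive both the default warn-only mode and a `--strict` abort, which matches the "pure data; safe to test directly" note above.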

dhruva-vapi · 2026-05-05T02:03:58Z

Merge activity

May 5, 2:03 AM UTC: A user started a stack merge that includes this pull request via Graphite.
May 5, 2:05 AM UTC: Graphite couldn't merge this pull request because a downstack PR feat: validate command with five fail-fast schema/lockstep/shape checks #17 failed to merge.
May 5, 2:15 AM UTC: A user started a stack merge that includes this pull request via Graphite.
May 5, 2:16 AM UTC: Graphite rebased this pull request as part of a merge.
May 5, 2:17 AM UTC: @dhruva-reddy merged this pull request with Graphite.


This was referenced May 1, 2026

feat: simulation suite runner (npm run sim) #18

Merged

feat: snapshot-on-push + npm run rollback #21

Merged

feat: drift detection on push (--overwrite to bypass) #20

Merged

refactor: scoped state writes preserve untouched entries #22

Merged

dhruva-vapi force-pushed the dhruva-reddy/feat/sim-runner branch from 8468fc9 to 4e55f1f Compare May 1, 2026 22:56

dhruva-vapi force-pushed the dhruva-reddy/refactor/state-schema-content-hashes branch from c9cf252 to 10c101c Compare May 1, 2026 22:56

dhruva-vapi force-pushed the dhruva-reddy/feat/sim-runner branch from 4e55f1f to 346fbf7 Compare May 2, 2026 01:22

dhruva-vapi force-pushed the dhruva-reddy/refactor/state-schema-content-hashes branch from 10c101c to 1efe5ec Compare May 2, 2026 01:22

dhruva-vapi force-pushed the dhruva-reddy/feat/sim-runner branch from 346fbf7 to 7e5eb7f Compare May 2, 2026 01:27

dhruva-vapi force-pushed the dhruva-reddy/refactor/state-schema-content-hashes branch from 1efe5ec to 92c0312 Compare May 2, 2026 01:28

dhruva-vapi force-pushed the dhruva-reddy/feat/sim-runner branch from 7e5eb7f to ffa91d9 Compare May 2, 2026 01:31

dhruva-vapi force-pushed the dhruva-reddy/refactor/state-schema-content-hashes branch from 92c0312 to d4ac9d6 Compare May 2, 2026 01:32

dhruva-vapi closed this in cd00da7 May 5, 2026

dhruva-vapi reopened this May 5, 2026

dhruva-vapi force-pushed the dhruva-reddy/refactor/state-schema-content-hashes branch from d4ac9d6 to f0fe2ca Compare May 5, 2026 02:14

dhruva-vapi force-pushed the dhruva-reddy/feat/sim-runner branch from ffa91d9 to bfc2bac Compare May 5, 2026 02:14

dhruva-vapi changed the base branch from dhruva-reddy/feat/sim-runner to graphite-base/19 May 5, 2026 02:15

dhruva-vapi changed the base branch from graphite-base/19 to main May 5, 2026 02:16

dhruva-vapi force-pushed the dhruva-reddy/refactor/state-schema-content-hashes branch from f0fe2ca to e944188 Compare May 5, 2026 02:16

dhruva-vapi merged commit d78e655 into main May 5, 2026
1 check passed

dhruva-vapi deleted the dhruva-reddy/refactor/state-schema-content-hashes branch May 11, 2026 20:49

dhruva-vapi mentioned this pull request May 13, 2026

feat: npm run audit — read-only drift detector for state/dashboard divergence #27

Merged

5 tasks

dhruva-vapi merged 1 commit into main from dhruva-reddy/refactor/state-schema-content-hashes
